Neural Path Features and Neural Path Kernel: Understanding the role of gates in deep learning Chandrashekar Lakshminarayanan and Amit Vikram Singh
Review for NeurIPS paper: Neural Path Features and Neural Path Kernel: Understanding the role of gates in deep learning
The paper has been well received by all reviewers, and I find the contribution very interesting, as it sheds some light on the role of gating in DNNs. The mathematical formulations are strong and detailed, which is a plus. There are a few typos and some issues in the text that should be fixed, and the work of Fiat et al. should be clearly cited as well. A better explanation of the ablation study would have been desirable, although I leave it to the authors to consider whether space allows for better wording.
Rectified linear unit (ReLU) activations can also be thought of as 'gates', which either pass or stop their pre-activation input when they are 'on' (when the pre-activation input is positive) or 'off' (when the pre-activation input is negative), respectively. A deep neural network (DNN) with ReLU activations has many gates, and the on/off status of each gate changes across input examples as well as network weights. For a given input example, only a subset of gates are 'active', i.e., on, and the sub-network of weights connected to these active gates is responsible for producing the output. At randomised initialisation, the active sub-network corresponding to a given input example is random. During training, as the weights are learnt, the active sub-networks are also learnt, and could hold valuable information. To this end, we encode the on/off state of the gates for a given input in a novel 'neural path feature' (NPF), and the weights of the DNN are encoded in a novel 'neural path value' (NPV).
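The gating view described in the abstract can be made concrete with a small sketch. The following is a minimal NumPy illustration (not the paper's implementation, and the toy network sizes and weights are arbitrary assumptions): for one input, it records each layer's gate pattern, a 1 where the pre-activation is positive ('on') and a 0 where it is negative ('off'), and shows that the ReLU output equals the gate pattern times the pre-activation.

```python
import numpy as np

# Toy 2-layer ReLU network (sizes chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))   # layer-1 weights: 4 hidden units, 3 inputs
W2 = rng.standard_normal((2, 4))   # layer-2 weights: 2 outputs, 4 hidden units

x = np.array([0.5, -1.0, 2.0])     # one input example

pre1 = W1 @ x                      # layer-1 pre-activations
g1 = (pre1 > 0).astype(float)      # gate pattern: 1 if 'on', 0 if 'off'
h1 = g1 * pre1                     # ReLU output = gate * pre-activation

pre2 = W2 @ h1                     # layer-2 pre-activations
g2 = (pre2 > 0).astype(float)      # layer-2 gate pattern

# The weights feeding the 'on' gates form the active sub-network for x;
# the stacked gate patterns (g1, g2) are the input-dependent gating
# information that the neural path feature encodes.
print("layer-1 gates:", g1)
print("layer-2 gates:", g2)
```

A different input x would generally flip some gates and hence select a different active sub-network, which is exactly the input-dependence the NPF is designed to capture.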